Lesson 7: Tool Use

Must master:

6.1 Function Calling mode

JSON schema
Tool call format
Forcing the model to output only the action JSON (key point)

6.2 Tool types

Search tools (web search)
API requests (HTTP client)
Database queries (SQL tool)
Code execution (Python sandbox)
Browser control (playwright/selenium)
File system tools
Web scraping tools

6.3 Tool safety

Preventing execution of dangerous commands
Breaking out of infinite loops
Sandboxing (Python exec sandbox)
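
The first two safety items above can be sketched in a few lines. This is a minimal illustration, not a hardened sandbox: ALLOWED_TOOLS, MAX_STEPS, BLOCKED_PATTERNS, guard_call, and is_allowed are hypothetical names invented for this example, and pattern matching alone is easy to bypass. Real sandboxing of Python code needs process-level isolation (for example a container or a subprocess with resource limits), which this sketch does not provide.

```python
import re

# Hypothetical policy values, chosen only for illustration.
ALLOWED_TOOLS = {"search", "run_python"}
MAX_STEPS = 5
BLOCKED_PATTERNS = [r"\brm\s+-rf\b", r"\bos\.system\b", r"\bsubprocess\b"]


def guard_call(tool_name: str, tool_input: str, step: int) -> None:
    """Raise an error if the requested call violates the policy."""
    # A hard cap on steps is the simplest defense against an agent looping forever.
    if step >= MAX_STEPS:
        raise RuntimeError(f"step limit of {MAX_STEPS} reached, stopping the loop")
    # Only tools on the allowlist may run at all.
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool_name!r} is not on the allowlist")
    # Reject inputs that match obviously dangerous patterns.
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, tool_input):
            raise PermissionError(f"input matches blocked pattern {pattern!r}")


def is_allowed(tool_name: str, tool_input: str, step: int) -> bool:
    """Convenience wrapper that returns False instead of raising."""
    try:
        guard_call(tool_name, tool_input, step)
        return True
    except (RuntimeError, PermissionError):
        return False
```

Calling guard_call before every tool execution keeps the policy in one place, so the agent loop itself stays simple.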

6.4 Multi-tool selection (Tool Selection)

Let the model decide automatically among:

search / browse / run_python / summarize / finish
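
One common way to wire this up is a registry that maps the tool name chosen by the model to a Python function. The sketch below assumes that approach; TOOL_REGISTRY, dispatch, and the stub tool bodies are hypothetical and only stand in for real implementations. The finish action is treated as the signal to stop and return the final answer.

```python
from typing import Callable, Dict, Tuple


# Stub tools standing in for real implementations (all hypothetical).
def search(query: str) -> str:
    return f"[search results for: {query}]"


def browse(url: str) -> str:
    return f"[page content of: {url}]"


def run_python(code: str) -> str:
    return "[sandboxed execution output]"


def summarize(text: str) -> str:
    return text[:40]


TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "search": search,
    "browse": browse,
    "run_python": run_python,
    "summarize": summarize,
}


def dispatch(name: str, arguments: dict) -> Tuple[bool, str]:
    """Return (done, output). finish ends the loop; unknown names are reported back."""
    if name == "finish":
        return True, arguments.get("answer", "")
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        # Returning the error as text lets the model see it and pick another tool.
        return False, f"error: unknown tool {name!r}"
    return False, tool(**arguments)
```

Reporting an unknown tool name back as an observation, rather than raising, gives the model a chance to correct itself on the next turn.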

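(1) Tool Calling

In earlier agent setups, the model returned its decision as JSON-formatted text, and we had to parse that text before we could extract which tool the model wanted to use and with what input. That parsing step is fragile: the model can wrap the JSON in extra prose or emit malformed JSON. Native tool calling removes this layer. OpenAI introduced function calling in mid-2023 and many vendors have since adopted the same pattern, so the model no longer "describes which tool to call" in free text but "directly requests a tool call" through a structured field of the API response.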

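The approach you learned earlier👇

Thought:
Action:
Action Input:

json.loads(raw)
validate_decision(...)

This is the ReAct / Agent engineering approach of 2023 to early 2024: the prompt simulates a protocol and your code acts as the "compiler", so stability can never be guaranteed one hundred percent.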
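
To make the fragility concrete, here is a minimal sketch of that prompt-protocol parsing step. The notes name validate_decision but do not show its body, so the checks below are an assumption; KNOWN_ACTIONS, parse_decision, and the action / action_input field names are likewise hypothetical.

```python
import json

KNOWN_ACTIONS = {"search", "finish"}  # hypothetical action set


def validate_decision(decision: dict) -> dict:
    """Assumed behavior: check the fields the prompt asked the model to produce."""
    if not isinstance(decision, dict):
        raise ValueError("decision must be a JSON object")
    if decision.get("action") not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {decision.get('action')!r}")
    if not isinstance(decision.get("action_input"), str):
        raise ValueError("action_input must be a string")
    return decision


def parse_decision(raw: str) -> dict:
    """Parse model text into a decision; any deviation from the format raises."""
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    return validate_decision(decision)
```

A reply such as 'Sure! Here is the JSON: {...}' fails in json.loads even though the intent is clear, which is exactly the kind of instability the prompt protocol cannot rule out.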

Modern Tool Calling works like this:

Step 1: "Register the tools" in the API call

Taking OpenAI / Comet / OpenRouter style interfaces as an example (pseudocode):

tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    }
]

👉 Note: the parameter schema is already defined here (query is a string)
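
The parameters field is a JSON Schema, so the same definition can be reused on your side to check the arguments the model sends back. The sketch below is a hand-rolled check covering only required keys and a few primitive types; check_arguments, TYPE_MAP, and SEARCH_PARAMETERS are hypothetical names, and a real project would more likely use a full JSON Schema validator library.

```python
# Maps JSON Schema primitive type names to Python types (a small subset only).
# Caveat: bool is a subclass of int in Python, so True would pass as "integer".
TYPE_MAP = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

# Same shape as the "parameters" object of the search tool.
SEARCH_PARAMETERS = {
    "type": "object",
    "properties": {"query": {"type": "string"}},
    "required": ["query"],
}


def check_arguments(schema: dict, arguments: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            # Stricter than JSON Schema's default, which permits extra properties.
            problems.append(f"unexpected argument: {name}")
            continue
        expected = TYPE_MAP.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"argument {name} should be of type {spec['type']}")
    return problems
```

Returning a list of problems instead of raising makes it easy to send the whole list back to the model as a tool error so it can retry with corrected arguments.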

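Step 2: The model decides whether to call a tool

You send a message:

from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Keep the history in a list so later steps can append to it
messages = [{"role": "user", "content": "What is an agent?"}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools
)

If the model decides to call a tool, it returns no text content; instead the response looks like this (abridged):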

{
  "choices": [
    {
      "message": {
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "search",
              "arguments": "{\"query\": \"definition of agent in AI\"}"
            }
          }
        ]
      }
    }
  ]
}

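👉 There is no free text to parse here: the shape of the tool call (tool name, call id, arguments field) is guaranteed by the API layer. 👉 Note that arguments is still a JSON string generated by the model, so decode it with json.loads and validate it before use.

Step 3: Execute the tool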

import json

tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)  # arguments is a JSON string
result = search(**args)

Step 4: Send the tool result back to the model

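# messages is the history list sent in Step 2.
# The assistant message that requested the tool must precede the tool result.
messages.append(response.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": str(result)  # content must be a string
})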

Then call the model once more, and it will give the final answer based on the Observation.

In short:

Tool calling = upgrading "Action / Action Input" from a prompt protocol to an API protocol
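
The four steps can be put together in one loop. The sketch below runs without a network connection by replacing the API with a stand-in function: fake_model, run_agent, TOOLS, and the stub search are all hypothetical, and the message dicts follow the OpenAI-style chat format, where arguments is a JSON string. A real client returns objects rather than plain dicts, so attribute access would replace the dict lookups.

```python
import json


def search(query: str) -> str:
    """Stub tool; a real one would call a search API."""
    return f"An agent is a system that acts toward a goal. (query: {query})"


TOOLS = {"search": search}


def fake_model(messages: list) -> dict:
    """Stand-in for the API: request a tool first, then answer from its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"Answer: {last['content']}"}
    call = {
        "id": "call_1",
        "type": "function",
        "function": {
            "name": "search",
            "arguments": json.dumps({"query": last["content"]}),
        },
    }
    return {"role": "assistant", "content": None, "tool_calls": [call]}


def run_agent(question: str, max_turns: int = 3) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_turns):
        reply = fake_model(messages)
        messages.append(reply)  # keep the assistant turn in the history
        if not reply.get("tool_calls"):
            return reply["content"]  # no tool requested: this is the final answer
        for call in reply["tool_calls"]:
            args = json.loads(call["function"]["arguments"])
            result = TOOLS[call["function"]["name"]](**args)
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
    raise RuntimeError("no final answer within the turn limit")
```

Calling run_agent("What is an agent?") goes through one tool request, one tool result, and one final answer; the max_turns cap keeps the loop from running forever if the model keeps requesting tools.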
